NSF PAR Search | NSF Public Access Repository

Scalable Decision-Making In Stochastic Environments Through Learned Temporal Abstraction

Luo, Baiting; Pettet, Ava; Laszka, Aron; Dubey, Abhishek; Mukhopadhyay, Ayan (January 2025, International Conference on Learning Representations)

Free, publicly-accessible full text available January 23, 2026
Reinforcement Learning-based Approach for Vehicle-to-Building Charging with Heterogeneous Agents and Long Term Rewards

Liu, Fangqi; Sen, Rishav; Talusan, Jose; Pettet, Ava; Kandel, Aaron; Suzue, Yoshinori; Mukhopadhyay, Ayan; Dubey, Abhishek (December 2024, International Conference on Autonomous Agents and Multi-Agent Systems)

Full Text Available
Act as You Learn: Adaptive Decision-Making in Non-Stationary Markov Decision Processes

Luo, Baiting; Zhang, Yunuo; Dubey, Abhishek; Mukhopadhyay, Ayan (May 2024, 22nd International Conference on Autonomous Agents and Multiagent Systems (AAMAS))

A fundamental (and largely open) challenge in sequential decision-making is dealing with non-stationary environments, where exogenous environmental conditions change over time. Such problems are traditionally modeled as non-stationary Markov decision processes (NSMDPs). However, existing approaches for decision-making in NSMDPs have two major shortcomings: first, they assume that the updated environmental dynamics at the current time are known (although future dynamics can change); and second, planning is largely pessimistic, i.e., the agent acts "safely" to account for the non-stationary evolution of the environment. We argue that both these assumptions are invalid in practice: updated environmental conditions are rarely known, and as the agent interacts with the environment, it can learn about the updated dynamics and avoid being pessimistic, at least in states whose dynamics it is confident about. We present a heuristic search algorithm called Adaptive Monte Carlo Tree Search (ADA-MCTS) that addresses these challenges. We show that the agent can learn the updated dynamics of the environment over time and then act as it learns, i.e., if the agent is in a region of the state space about which it has updated knowledge, it can avoid being pessimistic. To quantify "updated knowledge," we separate the aleatoric and epistemic uncertainty in the agent's updated belief and show how the agent can use these estimates for decision-making. We compare the proposed approach with multiple state-of-the-art approaches in decision-making across multiple well-established open-source problems and empirically show that our approach is faster and highly adaptive without sacrificing safety.
Full Text Available
Shrinking POMCP: A Framework for Real-Time UAV Search and Rescue

https://doi.org/10.1109/ICAA64256.2024.00016

Zhang, Yunuo; Luo, Baiting; Mukhopadhyay, Ayan; Stojcsics, Daniel; Elenius, Daniel; Roy, Anirban; Jha, Susmit; Maroti, Miklos; Koutsoukos, Xenofon; Karsai, Gabor; et al. (October 2024, IEEE)

Full Text Available
Decision Making in Non-Stationary Environments with Policy-Augmented Search

Pettet, Ava; Zhang, Yunuo; Luo, Baiting; Wray, Kyle; Baier, Hendrik; Laszka, Aron; Dubey, Abhishek; Mukhopadhyay, Ayan (May 2024, International Conference on Autonomous Agents and Multiagent Systems)

Sequential decision-making under uncertainty is present in many important problems. Two popular approaches for tackling such problems are reinforcement learning and online search (e.g., Monte Carlo tree search). While the former learns a policy by interacting with the environment (typically before execution), the latter uses a generative model of the environment to sample promising action trajectories at decision time. Decision-making is particularly challenging in non-stationary environments, where the environment in which an agent operates can change over time. Both approaches have shortcomings in such settings. On the one hand, policies learned before execution become stale when the environment changes, and relearning takes both time and computational effort. Online search, on the other hand, can return sub-optimal actions when the allowed runtime is limited. In this paper, we introduce Policy-Augmented Monte Carlo tree search (PA-MCTS), which combines action-value estimates from an out-of-date policy with an online search using an up-to-date model of the environment. We prove theoretical results showing conditions under which PA-MCTS selects the one-step optimal action and also bound the error accrued while following PA-MCTS as a policy. We compare and contrast our approach with AlphaZero, another hybrid planning approach, and Deep Q Learning on several OpenAI Gym environments. Through extensive experiments, we show that under non-stationary settings with limited time budgets, PA-MCTS outperforms these baselines.
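As a rough illustration of how stale action-value estimates might be combined with online search returns, the sketch below scores each action with a convex combination of the two. The abstract only says the two sources are "combined"; the convex weighting, the parameter name `alpha`, and the function name `pa_mcts_select` are assumptions made for this example, not details confirmed by the listing.

```python
def pa_mcts_select(q_policy, g_search, alpha):
    """Pick the action with the best blended score.

    q_policy: dict mapping action -> value estimate from the (possibly stale) policy.
    g_search: dict mapping action -> return estimate from online search.
    alpha:    weight in [0, 1] placed on the policy estimate (assumed form).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # Only score actions for which both estimates are available.
    actions = q_policy.keys() & g_search.keys()
    if not actions:
        raise ValueError("no action has both estimates")
    scores = {
        a: alpha * q_policy[a] + (1.0 - alpha) * g_search[a] for a in actions
    }
    # Sorting first makes tie-breaking deterministic.
    best = max(sorted(scores), key=scores.get)
    return best, scores


if __name__ == "__main__":
    stale_q = {"left": 1.0, "right": 0.0}   # hypothetical pre-change estimates
    fresh_g = {"left": 0.0, "right": 1.0}   # hypothetical search returns
    print(pa_mcts_select(stale_q, fresh_g, alpha=0.2)[0])
```

Under this assumed form, a small `alpha` trusts the up-to-date search more, while a large `alpha` leans on the learned policy, which may help when the search budget is too small to produce reliable estimates.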
Full Text Available
Calibrating Real-World City Traffic Simulation Model Using Vehicle Speed Data

https://doi.org/10.1109/SMARTCOMP58114.2023.00076

Khaleghian, Seyedmehdi; Neema, Himanshu; Sartipi, Mina; Tran, Toan; Sen, Rishav; Dubey, Abhishek (June 2023, 2023 IEEE International Conference on Smart Computing (SMARTCOMP))

Large-scale traffic simulations are necessary for the planning, design, and operation of city-scale transportation systems. These simulations enable novel and complex transportation technologies and services, such as optimizing traffic control systems, supporting on-demand transit, and redesigning regional transit systems for better energy efficiency and lower emissions. For a city-wide simulation model, big data from multiple sources such as Open Street Map (OSM), traffic surveys, geo-location traces, vehicular traffic data, and transit details are integrated to create a unique and accurate representation. However, to accurately identify the model structure and obtain reliable simulation results, these traffic simulation models must be thoroughly calibrated and validated against real-world data. This paper presents a novel calibration approach for a city-scale traffic simulation model based on limited real-world speed data. The simulation model runs realistic microscopic and mesoscopic traffic simulations of Chattanooga, TN (US) over a 24-hour period and includes various transport modes such as transit buses, passenger cars, and trucks. The experimental results demonstrate the effectiveness of our approach for calibrating large-scale traffic networks using only real-world speed data. The proposed calibration approach uses 2160 real-world speed data points, performs a sensitivity analysis of the simulation model with respect to its input parameters, and applies a genetic algorithm to calibrate the model.
Full Text Available
HPRoP: Hierarchical Privacy-Preserving Route Planning for Smart Cities

https://doi.org/10.1145/3616874

Tiausas, Francis; Yasumoto, Keiichi; Talusan, Jose Paolo; Yamana, Hayato; Yamaguchi, Hirozumi; Bhattacharjee, Shameek; Dubey, Abhishek; Das, Sajal K. (August 2023, ACM Transactions on Cyber-Physical Systems)

Route Planning Systems (RPS) are a core component of autonomous personal transport systems, essential for safe and efficient navigation of dynamic urban environments with the support of edge-based smart city infrastructure, but they also raise concerns about user route privacy in the context of both privately-owned and commercial vehicles. Numerous high-profile data breaches in recent years have motivated research on privacy-preserving RPS, but most proposed systems are rendered impractical by greatly increased communication and processing overhead. We address this by proposing an approach called Hierarchical Privacy-Preserving Route Planning (HPRoP), which divides and distributes the route planning task across multiple levels and protects locations along the entire route. This is done by combining Inertial Flow partitioning, Private Information Retrieval (PIR), and Edge Computing techniques with our novel route planning heuristic algorithm. Normalized metrics were also formulated to quantify the privacy of the source/destination points (endpoint location privacy) and the route itself (route privacy). Evaluation on a simulated road network showed that HPRoP reliably produces routes that differ in length from optimal shortest paths by at most 20%, with completion times within about 25 seconds, which is reasonable for a PIR-based approach. In addition, more than half of the produced routes achieved near-optimal endpoint location privacy (∼1.0) and good route privacy (≥ 0.8).
Full Text Available
Rolling Horizon Based Temporal Decomposition for the Offline Pickup and Delivery Problem with Time Windows

https://doi.org/10.1609/aaai.v37i4.25644

Kim, Youngseo; Edirimanna, Danushka; Wilbur, Michael; Pugliese, Philip; Laszka, Aron; Dubey, Abhishek; Samaranayake, Samitha (June 2023, Proceedings of the AAAI Conference on Artificial Intelligence)

The offline pickup and delivery problem with time windows (PDPTW) is a classical combinatorial optimization problem in the transportation community, which has proven to be very challenging computationally. Due to the complexity of the problem, practical problem instances can be solved only via heuristics, which trade off solution quality for computational tractability. Among the various heuristics, a common strategy is problem decomposition, that is, the reduction of a large-scale problem into a collection of smaller sub-problems, with spatial and temporal decompositions being two natural approaches. While spatial decomposition has been successful in certain settings, effective temporal decomposition has been challenging due to the difficulty of stitching together the sub-problem solutions across the decomposition boundaries. In this work, we introduce a novel temporal decomposition scheme for solving a class of PDPTWs that have narrow time windows, for which it is able to provide both fast and high-quality solutions. We utilize techniques that have been popularized recently in the context of online dial-a-ride problems along with the general idea of rolling horizon optimization. To the best of our knowledge, this is the first attempt to solve offline PDPTWs using such an approach. To show the performance and scalability of our framework, we use the optimization of paratransit services as a motivating example. Due to the lack of benchmark solvers similar to ours (i.e., temporal decomposition with an online solver), we compare our results with an offline heuristic algorithm using Google OR-Tools. In smaller problem instances (with an average of 129 requests per instance), the baseline approach is as competitive as our framework. However, in larger problem instances (approximately 2,500 requests per instance), our framework is more scalable and can provide good solutions to problem instances of varying degrees of difficulty, while the baseline algorithm often fails to find a feasible solution within comparable compute times.
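To make the idea of temporal decomposition concrete, here is a minimal sketch of only the partitioning step: requests are grouped into consecutive fixed-length time windows that a solver could then process in order, carrying vehicle state from one window to the next. The function name `rolling_horizon_batches`, the `(request_id, pickup_minute)` tuple layout, and the fixed window length are assumptions for illustration; the abstract does not describe the authors' actual scheme at this level of detail.

```python
from collections import defaultdict


def rolling_horizon_batches(requests, horizon_minutes):
    """Group requests into consecutive time windows of equal length.

    requests:        iterable of (request_id, pickup_minute) pairs (assumed layout).
    horizon_minutes: window length in minutes; must be positive.
    Returns a list of batches ordered by time, skipping empty windows.
    """
    if horizon_minutes <= 0:
        raise ValueError("horizon_minutes must be positive")
    batches = defaultdict(list)
    for request_id, pickup_minute in requests:
        # Integer division assigns each request to its window index.
        batches[pickup_minute // horizon_minutes].append(request_id)
    return [batches[index] for index in sorted(batches)]


if __name__ == "__main__":
    demo = [("a", 5), ("b", 65), ("c", 20), ("d", 130)]  # hypothetical requests
    for batch in rolling_horizon_batches(demo, horizon_minutes=60):
        print(batch)
```

The hard part named in the abstract, stitching sub-problem solutions across window boundaries, is deliberately left out here; narrow time windows presumably help because few requests straddle a boundary.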
Full Text Available
Synchrophasor Data Event Detection using Unsupervised Wavelet Convolutional Autoencoders

https://doi.org/10.1109/SMARTCOMP58114.2023.00080

Buckelew, Jacob; Basumallik, Sagnik; Sivaramakrishnan, Vasavi; Mukhopadhyay, Ayan; Srivastava, Anurag K.; Dubey, Abhishek (June 2023, 2023 IEEE International Conference on Smart Computing (SMARTCOMP))

Timely and accurate detection of events affecting the stability and reliability of power transmission systems is crucial for safe grid operation. This paper presents an efficient unsupervised machine-learning algorithm for event detection that combines the discrete wavelet transform (DWT) and convolutional autoencoders (CAE) with synchrophasor measurements. These measurements are collected from a hardware-in-the-loop testbed equipped with a digital real-time simulator. Using DWT, the detail coefficients of the measurements are obtained. The decomposed data is then fed into the CAE, which captures the underlying structure of the transformed data. Anomalies are identified when the error between input samples and their reconstructed outputs is significant. We demonstrate our approach on the IEEE-14 bus system, considering different events such as generator faults, line-to-line faults, line-to-ground faults, load shedding, and line outages simulated on a real-time digital simulator (RTDS). The proposed implementation achieves a classification accuracy of 97.7%, precision of 98.0%, recall of 99.5%, and F1 score of 98.7%, and proves to be efficient in both time and space requirements compared to baseline approaches.
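The two ends of the pipeline described above can be sketched in a few lines: extracting detail coefficients with a one-level Haar transform, and flagging samples whose reconstruction error exceeds a threshold. This is only an illustration under stated assumptions. The abstract does not name the wavelet, the error measure, or the threshold, so the Haar wavelet, mean squared error, and the function names here are hypothetical, and the autoencoder itself is replaced by precomputed reconstructions.

```python
import math


def haar_detail(signal):
    """One-level Haar DWT detail coefficients of an even-length signal.

    Uses the sign convention (x[2k] - x[2k+1]) / sqrt(2); other
    conventions flip the sign.
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    return [
        (signal[i] - signal[i + 1]) / math.sqrt(2)
        for i in range(0, len(signal), 2)
    ]


def flag_anomalies(inputs, reconstructions, threshold):
    """Flag each sample whose mean squared reconstruction error exceeds threshold.

    inputs, reconstructions: equal-length lists of equal-length vectors.
    In the real pipeline the reconstructions would come from a trained
    autoencoder; here they are supplied directly (assumption).
    """
    flags = []
    for x, x_hat in zip(inputs, reconstructions):
        mse = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
        flags.append(mse > threshold)
    return flags


if __name__ == "__main__":
    print(haar_detail([1.0, 1.0, 4.0, 0.0]))
    print(flag_anomalies([[0.0, 0.0], [0.0, 0.0]],
                         [[0.0, 0.0], [1.0, 1.0]], threshold=0.5))
```

Detail coefficients respond to abrupt changes between neighboring samples, which is one plausible reason to feed them, rather than raw measurements, to a model trained on normal operation.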
Full Text Available
Mobility-On-Demand Transportation: A System for Microtransit and Paratransit Operations

https://doi.org/10.1145/3576841.3589625

Wilbur, Michael; Coursey, Maxime; Koirala, Pravesh; Al-Quran, Zakariyya; Pugliese, Philip; Dubey, Abhishek (May 2023, Proceedings of the ACM/IEEE 14th International Conference on Cyber-Physical Systems (with CPS-IoT Week))

New rideshare and shared-mobility services have transformed urban mobility in recent years, and transit agencies are looking for ways to adapt to this rapidly changing environment. In this space, ridepooling has the potential to improve efficiency and reduce costs by allowing users to share rides in high-capacity vehicles and vans. Most transit agencies already operate various ridepooling services, including microtransit and paratransit. However, the objectives and constraints for implementing these services vary greatly between agencies. This brings multiple challenges. First, off-the-shelf ridepooling formulations must be adapted for real-world conditions and constraints. Second, the lack of modular and reusable software makes it hard to implement and evaluate new ridepooling algorithms and approaches in real-world settings. Therefore, we propose on-demand transportation scheduling software for microtransit and paratransit services. This software is aimed at transit agencies looking to incorporate state-of-the-art rideshare and ridepooling algorithms in their everyday operations. We provide management software for dispatchers and mobile applications for drivers and users. Lastly, we discuss the challenges in adapting state-of-the-art methods to real-world operations.
Full Text Available
